A framework for sample size calculations in longitudinal surveys to measure net and gross changes

In longitudinal surveys, repeated measurements are collected from the same sample units over time to measure gross change (i.e., change at the level of individual sample members). Longitudinal samples are sometimes supplemented by a fresh sample to measure net change (i.e., change at the aggregate level). That is, in each measurement wave, one part of the sample is newly recruited (fresh), while another part overlaps with the previously interviewed sample (repeated interviews). Many aspects of the design of longitudinal surveys have been studied extensively, such as definition of the target population, sample design, survey weighting, intervals between interviews, nonresponse, and panel attrition. Although the impact of the overlap between samples on statistical power has been studied, sample size determination lacks a formulation that takes account of these factors in longitudinal surveys that aim to measure net and gross changes simultaneously. In this study, we propose a framework for sample size calculation to measure net and gross changes in estimated means or proportions concurrently in longitudinal surveys. The framework computes panel and fresh sample sizes for varying levels of net and gross change. Finally, we illustrate the framework using nchange, an R package we developed to execute the algorithm of the proposed framework. The framework and the R package will help researchers determine sample sizes that target a specific power for measuring net and gross changes in rotating- or split-panel surveys.

Duncan and Kalton [7] outlined the analytical properties of rotating- and split-panel survey designs along with those of other designs. For example, rotating- and split-panel designs allow estimating net change (i.e., change at the aggregate level) and gross change (i.e., change at the element level) between two time points simultaneously. At the same time, repeated measurements allow for research on the dynamics of causation and relationships ([8], p. 470). For example, Johnson, Pence, and Vine [9] used panel data from the SoC to investigate the impact of consumers' assessments of vehicle financing conditions on new car purchases. Furthermore, with such overlapping repeated-measurement designs, causality and growth patterns can be modeled using hazard, survival, and latent growth models [6]. Finally, apart from measuring changes, longitudinal surveys allow for cumulating data across time intervals; for example, monthly or annual surveys can be combined, yielding increased sample sizes for subgroups of interest [7].
In this research note, we focus on (1) net change in estimated means or proportions (the difference between estimates from the two samples of two survey waves), and (2) gross change in estimated means or proportions (the difference between estimates from only the overlapped samples of two survey waves). Although net change estimates provide information on stability at the aggregate level, a number of studies have shown that such information is not enough, especially for public policy decision makers, who need information on gross change as well [10]. For example, although the U.S. Census Bureau reported little net change in the poverty rate in the 1970s, results from the Panel Study of Income Dynamics indicate a gross decline in the poverty rate around the same period (i.e., the same families did not stay poor over the same period) [11][12][13][14][15].
Sample size for probability samples is determined using precision analysis or power analysis [16][17][18]. In precision analysis, sample size is determined to estimate an unknown parameter, such as a proportion, mean, odds ratio, or relative risk, with a prespecified precision at a fixed significance level. Prespecified values of standard error, margin of error, or coefficient of variation are typically used as indicators of precision. In power analysis, sample size is determined to achieve a desired power for detecting changes in estimated parameters at a fixed significance level. These changes might be differences in estimated means, proportions, odds ratios, or relative risks. The estimated parameters are typically measured from two independent samples (control and treatment groups) based on an experimental design, or from baseline and intervention surveys. Our colleagues [17] addressed a special case where sample size is determined to achieve a desired power for detecting changes between estimates that are measured from two overlapped samples. Because a portion of the change across waves is measured on the same sample units, the required sample size for longitudinal surveys is lower than that of repeated cross-sectional surveys. Although the literature focuses on showing how the precision of net change estimates varies with the intraclass correlations of repeated measurements [7], these general approaches to design with overlapping units cite measures of level and net change as objectives but do not incorporate precision requirements for gross change.
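To make the distinction concrete, the sketch below computes a sample size under each approach for a proportion, using the usual normal approximations under simple random sampling. The function names and numeric inputs are illustrative assumptions and do not come from the source or from any package.

```python
from math import ceil
from statistics import NormalDist


def z(p):
    """Standard normal quantile for probability p."""
    return NormalDist().inv_cdf(p)


def n_for_margin(p, margin, alpha=0.05):
    """Precision analysis: n such that the half-width of the
    (1 - alpha) confidence interval for a proportion is at most `margin`."""
    return ceil(z(1 - alpha / 2) ** 2 * p * (1 - p) / margin ** 2)


def n_per_sample_for_power(p1, p2, alpha=0.05, power=0.80):
    """Power analysis: n per independent sample needed to detect the
    difference p1 - p2 with a two-sided test at level alpha."""
    var_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(var_sum * (z(1 - alpha / 2) + z(power)) ** 2 / (p1 - p2) ** 2)


n_precision = n_for_margin(p=0.5, margin=0.05)
n_power = n_per_sample_for_power(p1=0.5, p2=0.6)
print(n_precision, n_power)
```

The first function fixes the width of the interval around a single estimate; the second fixes the chance of detecting a stated difference between two independent estimates. Neither accounts for overlap between samples, which is the case the rest of this note addresses.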
According to Lynn [6], "There are some aspects of survey design that are unique to longitudinal surveys, or are substantially different in nature to cross-sectional surveys. Standard survey methods textbooks provide little or no guidance on how to make design decisions regarding these aspects. Yet these aspects warrant careful consideration as design decisions can have weighty implications for data quality and for analysis possibilities" (p. 11). Lynn [6] provides a good summary of those aspects, such as definition of target population, sample design, weighting, intervals between waves, nonresponse, and panel attrition. Unfortunately, although sample size calculation is a key element in designing sample surveys, this aspect needs more attention with respect to longitudinal surveys, especially when several survey objectives need to be addressed. This research note addresses a common scenario in which measuring net and gross change are two key objectives of a longitudinal survey that follows either a rotating- or split-panel design. We outline a framework to compute sample sizes that focuses on the dual objectives of net and gross change estimates. We also introduce nchange, an R package we developed to execute the algorithm of the proposed framework. In the next section, we review the calculation of sample size to detect net and gross change. We then introduce a framework for calculating the sample size for detecting both changes. Finally, we present an illustration of the execution of the framework using the R package.

Sample size calculation to detect net and gross changes
In this section, we review the sample size calculation for detecting net and gross changes between two time points, $t$ and $t+1$, using samples from a split- or a rotating-panel design. Let $s_t$ and $s_{t+1}$ denote two samples of size $n_t$ and $n_{t+1}$, where $s_{10}$ denotes a sample subset of $n_{10}$ units with data collected only at time $t$, $s_{11}$ denotes a sample subset of $n_{11}$ units with data collected at times $t$ and $t+1$, and $s_{01}$ denotes a sample subset of $n_{01}$ units with data collected at time $t+1$ only. This structure implies that $s_t = s_{10} \cup s_{11}$ and $s_{t+1} = s_{01} \cup s_{11}$.
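The notation can be illustrated with a small sketch. The unit identifiers below are hypothetical and serve only to show how the three subsets combine into the two wave samples.

```python
# Hypothetical unit identifiers, used only to illustrate the notation.
s_10 = {1, 2, 3, 4}            # data collected at time t only
s_11 = {5, 6, 7, 8, 9, 10}     # data collected at both t and t+1 (overlap)
s_01 = {11, 12, 13}            # data collected at time t+1 only

s_t = s_10 | s_11              # sample at time t
s_t1 = s_01 | s_11             # sample at time t+1

n_t, n_t1, n_11 = len(s_t), len(s_t1), len(s_11)

# The overlap of the two wave samples is exactly s_11.
assert s_t & s_t1 == s_11
print(n_t, n_t1, n_11)
```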

Net change
When both $s_t$ and $s_{t+1}$ are simple random samples and $y_i$ and $x_i$ represent the attribute measured at $t$ and $t+1$ for the $i$th unit, the net change in means, $\hat{d}$, is the difference between the two sample means (Eq 1), and the variance of $\hat{d}$ (Eq 2) depends on $\sigma_x^2$ and $\sigma_y^2$, the population variances of X and Y, and on $\sigma_{xy}$, the population covariance of X and Y. The required sample size to detect a net change $\delta$ is given by Eq (3), where $r = n_t/n_{t+1}$, $\gamma = n_{11}/n_t$, $\rho = \sigma_{xy}/(\sigma_x \sigma_y)$, $Z_{1-\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the standard normal distribution that corresponds to the desired level of significance ($\alpha$) for a two-sided test, and $Z_\beta$ corresponds to the desired power of the test ($1-\beta$) for the net change. Eq (3) is the two-sided test version of equation (4.13) in [17].
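The reasoning behind the variance and sample size expressions can be sketched as follows. This is a hedged reconstruction and not a copy of the source equations. It assumes that $y$ is measured on the $n_t$ units of $s_t$ and $x$ on the $n_{t+1}$ units of $s_{t+1}$, ignores finite population corrections, and writes $z_p$ for the $p$th standard normal quantile, so the power term appears as $z_{1-\beta}$, which may differ in sign convention from the $Z_\beta$ used in the text. Only the $n_{11}$ overlapping units induce covariance between the two sample means.

```latex
\begin{aligned}
\operatorname{Var}\!\left(\bar{y}_t - \bar{x}_{t+1}\right)
  &= \frac{\sigma_y^2}{n_t} + \frac{\sigma_x^2}{n_{t+1}}
     - 2\,\frac{n_{11}}{n_t\,n_{t+1}}\,\sigma_{xy} \\
  &= \frac{1}{n_t}\left[\sigma_y^2 + r\,\sigma_x^2
     - 2\,\gamma\,r\,\rho\,\sigma_x\sigma_y\right],
  \qquad r = \frac{n_t}{n_{t+1}},\quad
         \gamma = \frac{n_{11}}{n_t},\quad
         \rho = \frac{\sigma_{xy}}{\sigma_x\sigma_y}, \\
n_t &\approx \left[\sigma_y^2 + r\,\sigma_x^2
     - 2\,\gamma\,r\,\rho\,\sigma_x\sigma_y\right]
     \left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{\delta}\right)^{2}.
\end{aligned}
```

Which variance carries the factor $r$ depends on which wave is labeled $x$; the assignment above is an assumption. With $\gamma = 0$ the expression reduces to the familiar result for two independent samples.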
When the study variable is a binary outcome with 0/1 values, the difference in means reduces to a difference in proportions, $P_t - P_{t+1}$, and the required sample size to detect a net change $\delta$ is given by Eq (4), where $P_{XY}$ denotes the proportion of units in $s_{11}$ with the same study characteristic in $t$ and $t+1$. See the proof of Eq (4) in Appendix A in S1 Appendix.
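For a 0/1 variable, the variance and covariance terms of the net change can be expressed through proportions. The sketch below is a hedged reconstruction under two assumptions: $P_{XY}$ is read as the proportion of overlap units that have the characteristic at both waves, and $z_p$ denotes the $p$th standard normal quantile, with $r = n_t/n_{t+1}$ and $\gamma = n_{11}/n_t$ as defined in the text. The notation may differ from Eq (4) in the source.

```latex
\begin{aligned}
\sigma_y^2 &= P_t\,(1 - P_t), \qquad
\sigma_x^2 = P_{t+1}\,(1 - P_{t+1}), \qquad
\sigma_{xy} = P_{XY} - P_t\,P_{t+1}, \\
n_t &\approx \left[ P_t(1 - P_t) + r\,P_{t+1}(1 - P_{t+1})
      - 2\,\gamma\,r\left(P_{XY} - P_t\,P_{t+1}\right) \right]
      \left( \frac{z_{1-\alpha/2} + z_{1-\beta}}{\delta} \right)^{2}.
\end{aligned}
```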
As is clear from Eqs (3) and (4), the larger the correlation $\rho$ or the overlap $\gamma$, the smaller the sample size $n_t$ required to detect the same net change $\delta$. This implies that, for positively correlated measurements, increasing the panel share $\gamma$ can either increase the statistical power for detecting changes at a fixed sample size or reduce the required sample size (i.e., the costs of conducting the survey) at a fixed power. Previous research focused on incorporating this characteristic in computing precision for net change using rotating-panel or split-panel designs [6,7].
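The effect of the overlap can be seen numerically with a minimal sketch. It assumes simple random sampling, no finite population correction, and the standard normal-approximation form of the net change sample size; the function name `net_change_n` and all input values are hypothetical and are not part of the nchange package.

```python
from math import ceil
from statistics import NormalDist


def net_change_n(delta, sd_t, sd_t1, rho, gamma, r=1.0, alpha=0.05, power=0.80):
    """Approximate n_t needed to detect a net change `delta` (two-sided test).

    Assumptions: simple random sampling, no finite population correction,
    gamma = n_11 / n_t (overlap share) and r = n_t / n_{t+1}.
    """
    z = NormalDist().inv_cdf
    var_term = sd_t ** 2 + r * sd_t1 ** 2 - 2 * gamma * r * rho * sd_t * sd_t1
    return ceil(var_term * ((z(1 - alpha / 2) + z(power)) / delta) ** 2)


# Required n_t for increasing overlap shares, holding everything else fixed.
sizes = {
    g: net_change_n(delta=0.2, sd_t=1.0, sd_t1=1.0, rho=0.6, gamma=g)
    for g in (0.0, 0.25, 0.5, 0.75, 1.0)
}
for g, n in sizes.items():
    print(f"gamma={g:.2f}  n_t={n}")
```

With a positive correlation, the required $n_t$ falls steadily as the overlap share grows, which is the pattern the paragraph above describes.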

Gross change
Unlike with net change, units that existed at only one of the two time points, $s_{10}$ and $s_{01}$, do not contribute to the calculation of gross change. Only units that are measured at both time points contribute to gross change. Gross change $\Delta$ is therefore defined based on $s_{11}$ only (Eq 5). When $\sigma_x^2 = \sigma_y^2 = \sigma_0^2$, the sample size $n_{11}$ can be calculated as in Eq (6), where $\bar{Z}_{1-\bar{\alpha}/2}$ corresponds to the desired level of significance ($\bar{\alpha}$) for the gross change, and $\bar{Z}_{\bar{\beta}}$ corresponds to the desired power of the test ($1-\bar{\beta}$) for the gross change.
When the study variable is a binary outcome with 0/1 values, gross change can be measured as the difference between the proportions $\bar{P}_t$ and $\bar{P}_{t+1}$, both calculated from the $n_{11}$ overlapping units, and the required sample size to detect a gross change $\Delta$ can be calculated as in Eq (7). Both Eqs (6) and (7) can be easily derived from Eqs (3) and (4), respectively, by setting $r = 1$ and $\gamma = 1$.
Depending on the available data, different scenarios must be considered to compute sample sizes that measure net and gross changes concurrently. The amount of available information imposes a different problem parameterization as data collection advances from $t$ to $t+1$. For example, before time $t$, the problem is to find $n_t$, $n_{11}$, and $n_{t+1}$ of samples $s_t$ and $s_{t+1}$, whereas between times $t$ and $t+1$, the problem is to solve for $n_{11}$ and $n_{t+1}$ of sample $s_{t+1}$ given the available $n_t$ from sample $s_t$. In the latter scenario, when $n_t$ is known, the required sample size $n_{t+1}$ to detect a net change $\delta$ in a continuous variable can be calculated from Eqs (8) and (9). See Appendix A in S1 Appendix for a proof of Eq (8). Eq (9) is based on Eq (6). To detect a net change $\delta$ in proportions of a binary variable, Eqs (10) and (11) can be used. See Appendix A in S1 Appendix for a proof of Eq (10). Eq (11) is based on Eq (7). Unfortunately, finding similar closed-form formulas when $n_t$, $n_{11}$, and $n_{t+1}$ are all unknown is not easy (see Appendix B in S1 Appendix for details). Therefore, as illustrated in the next section, we propose a sequential algorithm to simulate the achieved power under different scenarios of $n_t$, $n_{11}$, and $n_{t+1}$ and choose the final sample size accordingly.

Proposed framework for sample size calculations under dual goal
In this section, we propose a framework to find $n_t$ and $n_{11}$ that are enough to (1) achieve the desired power ($1-\beta$) for detecting net change $\delta$ at a significance level $\alpha$, and (2) achieve the desired power ($1-\bar{\beta}$) for detecting gross change $\Delta$ at a significance level $\bar{\alpha}$. Because $n_{11}$ is one of the determinants of $n_t$, as $n_t \propto 1/\gamma$, finding proper values of $n_t$ and $n_{11}$ should be done simultaneously. In the proposed framework, we first simulate different data structures and then examine the power of detecting net and gross change simultaneously. To simulate the data structures, we use a two-stage power analysis. In the first stage, a proper $n_{11}$ is determined to achieve the desired power ($1-\bar{\beta}$) for detecting gross change $\Delta$ at a significance level $\bar{\alpha}$. In the second stage, we sequentially increase $n_{10}$ and approximate the achieved power ($1-\beta$) for detecting net change $\delta$ at a significance level $\alpha$. In the proposed framework and illustrations, we assume $r = 1$, that is, $n_t = n_{t+1}$; however, the framework can accommodate the other scenario of $r \neq 1$. As indicated in the flowchart presented in Fig 1, the proposed framework follows the algorithm below:
1. Find the required $n_{11}$ to measure gross change $\Delta$ at a significance level $\bar{\alpha}$ and ($1-\bar{\beta}$) power.
2. Find the required $n_t$ and $n_{t+1}$ to measure net change $\delta$ at a significance level $\alpha$ and ($1-\beta$) power using a sequential power analysis as follows:
2.1 Set a lower bound for $n_{10}$ as a function of $\theta$, where $0 \le \theta < 1$. When $\theta$ is set to 0, the lower bound of $n_{10}$ is 0; when $\theta$ is set to 0.5, the lower bound of $n_{10}$ equals $n_{11}$.
2.2 Set the iteration indicator as $i = 0$.
2.3 Set the sequential distance of iterations, $j$.
2.4 Set an initial value of $n_t$ as $n_i = n_{11} + i + j$.
2.5 Approximate the power of detecting the net change $\delta$ at a significance level $\alpha$ using $n_t = n_i$ through the statistic $Z_i$, where $n_{t+1}$ is set to be equal to $n_t$, and $\gamma_i = n_{11}/n_i$.
3. Perform a power assessment as follows: if $Z_i \ge Z_\beta$, stop and use $n_{11}$ and $n_t = n_i$. Otherwise, rerun steps 2.3 to 2.5.
4. Inflate sample sizes to account for panel attrition and nonresponse, where $RR_{11}$ and $RR_{10}$ are completion rates among panel and fresh sampling units, respectively.

Illustration
We designed the nchange R package to execute the algorithm of the proposed framework [19]. The main functions "seqmeans" and "seqprop" are designed to execute the proposed algorithms for means and proportions, respectively. The package also accommodates complex sample designs by inflating the calculated sample size by an approximated design effect (deff). See Appendix C in S1 Appendix for more details about the function arguments.

Illustration of sample size for two samples based on differences in proportions
Similarly, we can find $n_t$, $n_{11}$, and $n_{t+1}$ to measure changes in proportions using the seqprop function. When the net proportions are $P_t = 0.50$, $P_{t+1} = 0.70$, and $P_{XY} = 0.45$, and the gross proportions are $\bar{P}_t = 0.50$, $\bar{P}_{t+1} = 0.80$, and $\bar{P}_{XY} = 0.45$, with significance and power levels similar to the previous example, we can use the seqprop function as illustrated in Fig 3.

Fig 3. Illustration of sample size for two samples based on differences in proportions. https://doi.org/10.1371/journal.pone.0291449.g003

Conclusion
This article presents a framework to compute overlapping sample sizes in addition to total sample sizes for two consecutive waves, targeting specific magnitudes for gross and net change simultaneously in rotating- or split-panel designs. The framework's focus on both net and gross change estimation allows researchers to make informed decisions about both the sample and survey costs. Part of the challenge in large-scale, nationally representative panel surveys is the planning of efforts to sustain a sample that is representative of the population [12]. The efforts to sustain a representative panel sample are an important part of the success of panel surveys, and this framework provides a foundation to plan those efforts.
Despite the increasing use of panel surveys and of studies employing implicit rotating sample designs, and despite their acknowledged ability to measure change within individuals and over time across all individuals, these survey designs often lack simultaneous power calculations for both aspects [20]. The nchange R package aims to address this gap by providing practitioners with the tools to incorporate power calculations in designing such studies. With this package, researchers can better assess the statistical power of their studies for varying sample sizes and make informed decisions related to survey costs.
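The sequential procedure of the proposed framework (steps 1 to 4) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the nchange implementation: it covers means only, assumes $r = 1$ and simple random sampling, reads the lower bound of $n_{10}$ as $\theta/(1-\theta)\,n_{11}$ (one form consistent with the two stated cases $\theta = 0$ and $\theta = 0.5$), uses a user-supplied step size, and inflates each part by dividing by its completion rate. All function and parameter names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

N = NormalDist()


def gross_n11(Delta, sd0, rho, alpha=0.05, power=0.80):
    """Step 1: overlap size n_11 for gross change (equal variances, r = gamma = 1)."""
    zsum = N.inv_cdf(1 - alpha / 2) + N.inv_cdf(power)
    return ceil(2 * sd0 ** 2 * (1 - rho) * (zsum / Delta) ** 2)


def net_power(n_t, n_11, delta, sd_t, sd_t1, rho, alpha=0.05):
    """Step 2.5: approximate power for net change with n_{t+1} = n_t (r = 1)."""
    gamma = n_11 / n_t
    se = sqrt((sd_t ** 2 + sd_t1 ** 2 - 2 * gamma * rho * sd_t * sd_t1) / n_t)
    return N.cdf(abs(delta) / se - N.inv_cdf(1 - alpha / 2))


def sequential_sizes(delta, Delta, sd_t, sd_t1, rho, sd0, rho_gross,
                     alpha=0.05, power=0.80, alpha_gross=0.05, power_gross=0.80,
                     theta=0.0, step=1, rr_11=1.0, rr_10=1.0, max_iter=1_000_000):
    # Step 1: panel (overlap) size driven by the gross change target.
    n_11 = gross_n11(Delta, sd0, rho_gross, alpha_gross, power_gross)
    # Step 2.1 (assumed form): theta read as the minimum fresh share of n_t.
    n_10 = ceil(theta / (1 - theta) * n_11)
    # Steps 2.2 to 3: add fresh units until the net change power is reached.
    for _ in range(max_iter):
        n_t = n_11 + n_10
        if net_power(n_t, n_11, delta, sd_t, sd_t1, rho, alpha) >= power:
            break
        n_10 += step
    else:
        raise RuntimeError("target power not reached within max_iter")
    # Step 4 (assumed form): divide each part by its completion rate.
    return {"n_11": n_11, "n_10": n_10, "n_t": n_t,
            "n_11_inflated": ceil(n_11 / rr_11),
            "n_10_inflated": ceil(n_10 / rr_10)}


result = sequential_sizes(delta=0.2, Delta=0.3, sd_t=1.0, sd_t1=1.0, rho=0.6,
                          sd0=1.0, rho_gross=0.6, rr_11=0.8, rr_10=0.5)
print(result)
```

The design choice to fix $n_{11}$ first and then grow only the fresh part mirrors the two-stage logic of the framework: the gross change target determines the panel size, and the net change target determines how many fresh units are added on top of it.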